Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier in a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
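For concreteness, the estimator above can be computed as a linear program: the maximum $\ell_1$-margin classifier on separable data solves $\min \|w\|_1$ subject to $y_i \langle x_i, w\rangle \ge 1$ for all $i$. The sketch below is our own illustration (assumed Gaussian features and sparse ground truth; the name `max_l1_margin_classifier` is ours, not from the paper), solving the LP with SciPy via the split $w = w_+ - w_-$.

```python
# Minimal sketch (not the paper's code): the maximum l1-margin classifier,
#     min ||w||_1   s.t.   y_i <x_i, w> >= 1  for all i,
# solved as a linear program over (w_plus, w_minus) with w = w_plus - w_minus.
import numpy as np
from scipy.optimize import linprog

def max_l1_margin_classifier(X, y):
    n, d = X.shape
    c = np.ones(2 * d)                      # objective: sum(w+ + w-) = ||w||_1
    A = -(y[:, None] * X)                   # -y_i x_i^T w <= -1  <=>  margin >= 1
    A_ub = np.hstack([A, -A])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    return res.x[:d] - res.x[d:]

# toy usage: sparse ground truth in a high-dimensional, noiseless setting
rng = np.random.default_rng(0)
n, d = 50, 500
X = rng.standard_normal((n, d))
w_star = np.zeros(d); w_star[:3] = 1.0
y = np.sign(X @ w_star)
w_hat = max_l1_margin_classifier(X, y)
print("min training margin:", (y * (X @ w_hat)).min())  # should be ~1
```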
It is widely believed that given the same labeling budget, active learning algorithms like uncertainty sampling achieve better predictive performance than passive learning (i.e. uniform sampling), albeit at a higher computational cost. Recent empirical evidence suggests that this added cost might be in vain, as uncertainty sampling can sometimes perform even worse than passive learning. While existing works offer different explanations in the low-dimensional regime, this paper shows that the underlying mechanism is entirely different in high dimensions: we prove for logistic regression that passive learning outperforms uncertainty sampling even for noiseless data and when using the uncertainty of the Bayes optimal classifier. Insights from our proof indicate that this high-dimensional phenomenon is exacerbated when the separation between the classes is small. We corroborate this intuition with experiments on 20 high-dimensional datasets spanning a diverse range of applications, from finance and histology to chemistry and computer vision.
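A hedged toy comparison of the two strategies discussed above (our own setup, not the paper's experiments): noiseless labels from a linear ground truth, with uncertainty sampling querying the pool points of smallest Bayes-optimal margin $|x^\top w^*|$.

```python
# Toy illustration (our own, under assumed isotropic Gaussian features):
# passive learning (uniform sampling) vs. uncertainty sampling that queries
# the points closest to the Bayes-optimal decision boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
d, n_pool, n_label, n_test = 200, 2000, 100, 2000
w_star = rng.standard_normal(d); w_star /= np.linalg.norm(w_star)

def sample(n):
    X = rng.standard_normal((n, d))
    return X, np.sign(X @ w_star)          # noiseless labels

X_pool, y_pool = sample(n_pool)
X_test, y_test = sample(n_test)

def accuracy(idx):
    clf = LogisticRegression(max_iter=2000).fit(X_pool[idx], y_pool[idx])
    return clf.score(X_test, y_test)

passive = rng.choice(n_pool, n_label, replace=False)
uncertain = np.argsort(np.abs(X_pool @ w_star))[:n_label]   # smallest Bayes margin
print("passive accuracy:    ", accuracy(passive))
print("uncertainty accuracy:", accuracy(uncertain))
```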
As machine learning algorithms are deployed on sensitive data in critical decision-making processes, it is increasingly important that they are also private and fair. In this paper, we show that when the data has a long-tailed structure, it is not possible to build learning algorithms that are both private and achieve higher accuracy on minority subpopulations. We further show that relaxing overall accuracy can lead to good fairness even under strict privacy requirements. To corroborate our theoretical results in practice, we provide an extensive set of experimental results using a variety of synthetic, vision (CIFAR and CelebA), and tabular (Law School) datasets and learning algorithms.
In safety-critical applications, practitioners are reluctant to trust neural networks when no interpretable explanations are available. Many attempts to provide such explanations revolve around pixel-based attributions or the use of previously known concepts. In this paper, we aim to provide explanations in terms of \emph{high-level, previously unknown ground-truth concepts}. To this end, we propose a probabilistic modeling framework to derive (C)oncept (L)earning and (P)rediction (CLaP), a VAE-based classifier that uses visually interpretable concepts as predictors for a simple classifier. Assuming a generative model for the ground-truth concepts, we prove that CLaP is able to identify them while attaining optimal classification accuracy. Our experiments on synthetic datasets verify that CLaP identifies distinct ground-truth concepts and yields promising results on a medical chest X-ray dataset.
We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a.k.a. basis pursuit. Our results are tight up to negligible terms and are the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign overfitting" for minimum $\ell_2$-norm interpolation, where asymptotic consistency can only be achieved when the features are effectively low-dimensional.
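As a concrete reference for the estimator above, basis pursuit solves $\min \|w\|_1$ subject to $Xw = y$. The sketch below is our own illustration (assuming isotropic Gaussian features, a sparse ground truth, and $d \gg n$), casting it as a linear program:

```python
# Minimal sketch (our illustration, not the paper's code): the minimum
# l1-norm interpolator, a.k.a. basis pursuit,
#     min ||w||_1   s.t.   X w = y,
# as a linear program with the split w = w_plus - w_minus.
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(X, y):
    n, d = X.shape
    c = np.ones(2 * d)                      # ||w||_1 = sum(w+ + w-)
    A_eq = np.hstack([X, -X])               # X (w+ - w-) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.x[:d] - res.x[d:]

# noisy observations from a sparse ground truth, d >> n
rng = np.random.default_rng(0)
n, d, sigma = 50, 2000, 0.5
X = rng.standard_normal((n, d))
w_star = np.zeros(d); w_star[:5] = 1.0
y = X @ w_star + sigma * rng.standard_normal(n)
w_hat = basis_pursuit(X, y)
print("interpolation residual:", np.linalg.norm(X @ w_hat - y))
print("estimation error:      ", np.linalg.norm(w_hat - w_star))
```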
Numerous recent works have shown that overparameterization implicitly reduces the variance of min-norm interpolators and max-margin classifiers. These findings suggest that ridge regularization has vanishing benefits in high dimensions. We challenge this narrative by showing that, even in the absence of noise, avoiding interpolation through ridge regularization can significantly improve generalization. We prove this phenomenon for the robust risk of both linear regression and classification, and thereby provide the first theoretical result on robust overfitting.
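A rough illustration of the comparison described above (our own toy, not the paper's setting): ridge estimators at several regularization strengths are evaluated on an $\ell_2$-adversarial robust risk, with the smallest $\lambda$ approximating the min-norm interpolator. Whether regularization helps depends on the regime, so this sketch only makes the quantities concrete.

```python
# Toy sketch (our own, under assumed isotropic Gaussian features and
# noiseless labels): compare ridge estimators, including a near-interpolating
# one (lambda -> 0), on a robust risk with l2 perturbations of radius eps.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 100, 400, 0.05
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star                              # no label noise

def ridge(lmbda):
    return np.linalg.solve(X.T @ X + lmbda * np.eye(d), X.T @ y)

def robust_risk(w, n_test=5000):
    Xt = rng.standard_normal((n_test, d))
    yt = Xt @ w_star
    # worst-case l2 perturbation of size eps inflates the error by eps*||w||_2
    return np.mean((np.abs(Xt @ w - yt) + eps * np.linalg.norm(w)) ** 2)

for lmbda in [1e-8, 1e-2, 1.0, 10.0]:       # 1e-8 ~ min-norm interpolator
    print(f"lambda={lmbda:g}: robust risk = {robust_risk(ridge(lmbda)):.4f}")
```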
This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Universit\'e de Montr\'eal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).
Deformable object manipulation (DOM) is an emerging research problem in robotics. The ability to manipulate deformable objects endows robots with higher autonomy and promises new applications in the industrial, service, and healthcare sectors. However, compared to rigid object manipulation, manipulating deformable objects is considerably more complex and remains an open research problem. Addressing the challenges of DOM requires breakthroughs in almost all aspects of robotics, namely hardware design, sensing, (deformation) modeling, planning, and control. In this paper, we review recent advances and highlight the main challenges that arise when deformation is taken into account in each of these sub-fields. A particular focus of our paper lies in discussing these challenges and proposing future directions of research.
Enabling robots to work in close proximity to humans requires a control framework that not only incorporates multi-sensory information for autonomous and coordinated interaction, but also perception-aware task planning to ensure adaptive and flexible collaborative behavior. In this study, an intuitive stack-of-tasks (ISOT) formulation is proposed that defines the robot's actions by considering the human arm posture and the task progression. The framework is augmented with visuo-tactile information to effectively understand the collaborative environment and to switch intuitively between the planned sub-tasks. Visual feedback from a depth camera monitors and estimates the object pose and the human arm posture, while tactile data provide the exploration skills needed to detect and maintain the desired contacts and avoid object slippage. To evaluate the performance, effectiveness, and usability of the proposed framework on assembly and disassembly tasks performed by human-human and human-robot partners, different evaluation metrics are considered and analyzed: approach adaptation, grasp correction, task coordination latency, cumulative pose deviation, and task repeatability.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
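A generic sketch of trigger injection in the spirit of NAIVEATTACK as described above (our simplification, not the authors' code; the actual attack and the distillation pipeline differ): stamp a small patch onto a fraction of the raw images and relabel them to the attacker's target class before distillation begins.

```python
# Generic backdoor-trigger injection sketch (our own simplification of the
# NAIVEATTACK idea): poison a fraction of the raw training images with a
# small corner patch and the attacker's target label, then hand the poisoned
# data to the dataset distillation step.
import numpy as np

def inject_trigger(images, labels, target_class, poison_frac=0.1,
                   patch_size=3, patch_value=1.0, seed=0):
    """images: (N, H, W, C) floats in [0, 1]; labels: (N,) ints."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), int(poison_frac * len(images)), replace=False)
    images[idx, -patch_size:, -patch_size:, :] = patch_value   # bottom-right patch
    labels[idx] = target_class
    return images, labels

# usage on a fake CIFAR-like batch
X = np.random.rand(1000, 32, 32, 3).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
X_poisoned, y_poisoned = inject_trigger(X, y, target_class=0)
# X_poisoned / y_poisoned would then be fed to the dataset distillation procedure
```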